A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
Similar resources
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
We review the literature on approximate dynamic programming, with the goal of better understanding the theory behind practical algorithms for solving dynamic programs with continuous and vector-valued states and actions and complex information processes. We build on the literature that has addressed the well-known problem of multidimensional (and possibly continuous) states, and the extensive l...
Batch Policy Iteration Algorithms for Continuous Domains
This paper establishes the link between an adaptation of the policy iteration method for Markov decision processes with continuous state and action spaces and the policy gradient method when the differentiation of the mean value is directly done over the policy without parameterization. This approach allows deriving sound and practical batch Reinforcement Learning algorithms for continuous stat...
Convergence Analysis of Kernel-based On-policy Approximate Policy Iteration Algorithms for Markov Decision Processes with Continuous, Multidimensional States and Actions
Using kernel smoothing techniques, we propose three different online, on-policy approximate policy iteration algorithms which can be applied to infinite horizon problems with continuous and vector-valued states and actions. Using Monte Carlo sampling to estimate the value function around the post-decision state, we reduce the problem to a sequence of deterministic, nonlinear programming problem...
Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration
Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem, have been proposed recently. Finding good policies with such methods requires not only an appropriate classifier, but also reliable examples of best actions, covering the state space sufficiently. Up to this ti...
Existence and approximate $l^{p}$ and continuous solution of nonlinear integral equations of the Hammerstein and Volterra types
Many phenomena in our world are essentially nonlinear and are described by nonlinear equations. The advent of high-performance digital computers has made linear problems easier to solve. In general, however, exact solutions of nonlinear problems are difficult to obtain. Numerical methods can generally handle the complicated computation that nonlinear problems involve. However, supplying points on a curve and obtaining the complete curve, which is often costly and ...
Journal
Journal title: Journal of Control Theory and Applications
Year: 2011
ISSN: 1672-6340,1993-0623
DOI: 10.1007/s11768-011-0313-y